AITopics | topic taxonomy

Collaborating Authors

topic taxonomy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Self-supervised Topic Taxonomy Discovery in the Box Embedding Space

Lu, Yuyin, Chen, Hegang, Mao, Pengbo, Rao, Yanghui, Xie, Haoran, Wang, Fu Lee, Li, Qing

arXiv.org Artificial IntelligenceAug-27-2024

Topic taxonomy discovery aims at uncovering topics of different abstraction levels and constructing hierarchical relations between them. Unfortunately, most of prior work can hardly model semantic scopes of words and topics by holding the Euclidean embedding space assumption. What's worse, they infer asymmetric hierarchical relations by symmetric distances between topic embeddings. As a result, existing methods suffer from problems of low-quality topics at high abstraction levels and inaccurate hierarchical relations. To alleviate these problems, this paper develops a Box embedding-based Topic Model (BoxTM) that maps words and topics into the box embedding space, where the asymmetric metric is defined to properly infer hierarchical relations among topics. Additionally, our BoxTM explicitly infers upper-level topics based on correlation between specific topics through recursive clustering on topic boxes. Finally, extensive experiments validate high-quality of the topic taxonomy learned by BoxTM.

boxtm, taxonomy, topic taxonomy, (16 more...)

arXiv.org Artificial Intelligence

2408.1505

Country:

Asia > China > Hong Kong (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
(2 more...)

Add feedback

Using Zero-shot Prompting in the Automatic Creation and Expansion of Topic Taxonomies for Tagging Retail Banking Transactions

Moraes, Daniel de S., Santos, Pedro T. C., da Costa, Polyana B., Pinto, Matheus A. S., Pinto, Ivan de J. P., da Veiga, Álvaro M. G., Colcher, Sergio, Busson, Antonio J. G., Rocha, Rafael H., Gaio, Rennan, Miceli, Rafael, Tourinho, Gabriela, Rabaioli, Marcos, Santos, Leandro, Marques, Fellipe, Favaro, David

arXiv.org Artificial IntelligenceJan-7-2024

This work presents an unsupervised method for automatically constructing and expanding topic taxonomies by using instruction-based fine-tuned LLMs (Large Language Models). We apply topic modeling and keyword extraction techniques to create initial topic taxonomies and LLMs to post-process the resulting terms and create a hierarchy. To expand an existing taxonomy with new terms, we use zero-shot prompting to find out where to add new nodes, which, to our knowledge, is the first work to present such an approach to taxonomy tasks. We use the resulting taxonomies to assign tags that characterize merchants from a retail bank dataset. To evaluate our work, we asked 12 volunteers to answer a two-part form in which we first assessed the quality of the taxonomies created and then the tags assigned to merchants based on that taxonomy. The evaluation revealed a coherence rate exceeding 90% for the chosen taxonomies, while the average coherence for merchant tagging surpassed 80%.

category, taxonomy, topic taxonomy, (17 more...)

arXiv.org Artificial Intelligence

2401.0679

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation

Lee, Dongha, Shen, Jiaming, Lee, Seonghyeon, Yoon, Susik, Yu, Hwanjo, Han, Jiawei

arXiv.org Artificial IntelligenceOct-18-2022

Topic taxonomies display hierarchical topic structures of a text corpus and provide topical knowledge to enhance various NLP applications. To dynamically incorporate new topic information, several recent studies have tried to expand (or complete) a topic taxonomy by inserting emerging topics identified in a set of new documents. However, existing methods focus only on frequent terms in documents and the local topic-subtopic relations in a taxonomy, which leads to limited topic term coverage and fails to model the global topic hierarchy. In this work, we propose a novel framework for topic taxonomy expansion, named TopicExpan, which directly generates topic-related terms belonging to new topics. Specifically, TopicExpan leverages the hierarchical relation structure surrounding a new topic and the textual content of an input document for topic term generation. This approach encourages newly-inserted topics to further cover important but less frequent terms as well as to keep their relation consistency within the taxonomy. Experimental results on two real-world text corpora show that TopicExpan significantly outperforms other baseline methods in terms of the quality of output taxonomies.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2211.01981

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.05)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (0.93)
Media (0.93)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.68)
Leisure & Entertainment > Sports > Basketball (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters

Lee, Dongha, Shen, Jiaming, Kang, SeongKu, Yoon, Susik, Han, Jiawei, Yu, Hwanjo

arXiv.org Artificial IntelligenceJan-19-2022

Topic taxonomies, which represent the latent topic (or category) structure of document collections, provide valuable knowledge of contents in many applications such as web search and information filtering. Recently, several unsupervised methods have been developed to automatically construct the topic taxonomy from a text corpus, but it is challenging to generate the desired taxonomy without any prior knowledge. In this paper, we study how to leverage the partial (or incomplete) information about the topic structure as guidance to find out the complete topic taxonomy. We propose a novel framework for topic taxonomy completion, named TaxoCom, which recursively expands the topic taxonomy by discovering novel sub-topic clusters of terms and documents. To effectively identify novel topics within a hierarchical topic structure, TaxoCom devises its embedding and clustering techniques to be closely-linked with each other: (i) locally discriminative embedding optimizes the text embedding space to be discriminative among known (i.e., given) sub-topics, and (ii) novelty adaptive clustering assigns terms into either one of the known sub-topics or novel sub-topics. Our comprehensive experiments on two real-world datasets demonstrate that TaxoCom not only generates the high-quality topic taxonomy in terms of term coherency and topic coverage but also outperforms all other baselines for a downstream task.

data mining, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3485447.3512002

2201.06771

Country:

Europe > France (0.14)
North America > United States > Illinois (0.14)
Asia (0.14)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance (1.00)
Leisure & Entertainment > Sports > Soccer (0.93)
(8 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Large Margin Taxonomy Embedding for Document Categorization

Weinberger, Kilian Q., Chapelle, Olivier

Neural Information Processing SystemsDec-31-2009

Applications of multi-class classification, such as document categorization, often appear in cost-sensitive settings. Recent work has significantly improved the state of the art by moving beyond ``flat'' classification through incorporation of class hierarchies [Cai and Hoffman 04]. We present a novel algorithm that goes beyond hierarchical classification and estimates the latent semantic space that underlies the class hierarchy. In this space, each class is represented by a prototype and classification is done with the simple nearest neighbor rule. The optimization of the semantic space incorporates large margin constraints that ensure that for each instance the correct class prototype is closer than any other. We show that our optimization is convex and can be solved efficiently for large data sets. Experiments on the OHSUMED medical journal data base yield state-of-the-art results on topic categorization.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback